Abstract



Aleliunas et al. [1] posed the following question: "The reachability problem 
for undirected graphs can be solved in logspace and O(mn) time [m is the number
of edges and n is the number of vertices] by a probabilistic algorithm that simulates 
a random walk, or in linear time and space by a conventional deterministic graph 
traversal algorithm. Is there a spectrum of time-space trade-offs between these 
extremes?" We answer this question in the affirmative for graphs with a linear 
number of edges by presenting an algorithm that is faster than the random walk by 
a factor essentially proportional to the size of its workspace. For denser graphs, 
our algorithm is faster than the random walk but the speed-up factor is smaller. 
1 Motivation and Results



We consider the problem of s-t connectivity on an undirected graph (USTCON). Given 
a graph G with n vertices and m edges, and given two vertices s and t of G, we 
are to decide if s and t are in the same connected component. We are interested in 
space-bounded algorithms for USTCON, which is an important problem in the study of 
space-bounded complexity classes [3, 9]. Throughout this paper, we assume that our 
workspace takes the form of p registers, each capable of storing a log n-bit number.

There are two well-known approaches to solving USTCON: via a deterministic graph 
search on G (e.g., depth-first search) and via a simulation of a random walk on G [1]. 
(The standard random walk on G is the stochastic process associated with a particle 
moving from vertex to vertex according to the following rule: the probability of a 
transition from vertex i, of degree $d_i$, to vertex j is $1/d_i$ if {i, j} is an edge in G and 0
otherwise.) 

The first approach can be implemented to run in time O(m) using space O(n). The latter
requires space O(1), and has been shown to decide USTCON with one-sided error in
time O(mn) (i.e., if s and t are in the same connected component, the algorithm outputs
YES with probability at least 0.5; if they are in different components, the algorithm
outputs NO). For both these algorithms, the product of time and space is O(mn).
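
The random-walk approach can be sketched as follows. This is a minimal illustration, not the procedure of [1] verbatim: the function name `random_walk_ustcon`, the adjacency-list representation `adj` (a dict mapping each vertex to a list of neighbors), and the caller-supplied step budget `steps` are assumptions made for the example. A budget proportional to mn corresponds to the O(mn) bound quoted above.

```python
import random

def random_walk_ustcon(adj, s, t, steps, rng=None):
    """One-sided test: True means t was reached from s, so they are connected.

    False only means t was not seen within `steps` steps of the walk.
    """
    rng = rng or random.Random()
    v = s
    for _ in range(steps):
        if v == t:
            return True
        if not adj[v]:            # isolated vertex: the walk cannot move
            return False
        v = rng.choice(adj[v])    # move to a uniformly random neighbor
    return v == t
```

Only the current vertex and a step counter are kept, which is the constant number of registers the text refers to. A YES answer is always correct; a NO answer can be wrong when the budget is too small.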

Given space that is insufficient for depth-first search, can we decide USTCON faster 
than via a random walk? More precisely, given space p < n, can we bridge the gap 
between the depth-first search and the random walk by devising an algorithm that runs
in time O(mn/p)? Considering the time-space product achieved at the two extremes,
this seems a likely conjecture. 

In this paper we present an algorithm that runs in time $O(m^2 \log^5 n / p)$. Therefore, for
linear-sized graphs (i.e., m = O(n)), it achieves the bound conjectured above within a
poly-log factor. For denser graphs, our algorithm does not achieve the bound; but it is
faster than the random walk, at least, once p exceeds the average degree.

The basic idea of the new algorithm is to simulate a graph search, but only on a subset
of p vertices chosen independently at random according to the stationary distribution
of the random walk, together with the vertices s and t. (The stationary distribution of
the random walk is $\pi_v = d_v/(2m)$ where $d_v$ is the degree of vertex v.)
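
One way to draw vertices from this distribution is to pick an edge uniformly at random and then one of its two endpoints uniformly at random; vertex v is then selected with probability $d_v/(2m)$. The sketch below assumes the graph is given as a list `edges` of undirected edges without self-loops, and the name `sample_leaders` is hypothetical. It ignores the space model of the paper, since it relies on random access to the edge list.

```python
import random

def sample_leaders(edges, p, rng=None):
    """Draw p vertices independently, each with probability d_v / (2m)."""
    rng = rng or random.Random()
    leaders = []
    for _ in range(p):
        u, v = rng.choice(edges)          # uniformly random undirected edge
        # A uniformly random orientation makes the directed edge uniform
        # among all 2m directed edges; its head is the sampled vertex.
        leaders.append(u if rng.random() < 0.5 else v)
    return leaders
```

On a star with center 0 and three leaves, for example, the center has degree 3 out of 2m = 6, so about half of the samples should be the center.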

We refer to the p randomly chosen vertices as leaders. A single step in graph search
is replaced by a random walk of an appropriate length. Assuming that the graph
is connected, we show that for a certain constant $k_1$, a set of p walks of length
$\tau_1 = k_1 m^2 \ln^3 n / p^2$, one from each leader, will visit every vertex in the graph with
high probability, and furthermore the walk from each leader reaches some other leader,
thus proving that the two leaders are in the same component. With high probability all
leaders are proven connected within O(log n) trial walks from each leader.

In order to deal with the case when the graph is composed of several connected
components we repeat the procedure above O(log n) times with independent choices
of leaders and also add random walks from s and t. (See section 3 for a pseudo-Algol
description.) We show that with high probability, at least one choice results in
sufficiently many leaders in the component of interest (which contains both s and t) to
ensure the success of the method. Thus we have an algorithm for USTCON with an
overall running time of $O(m^2 \log^5 n / p)$. Notice that this algorithm resembles standard
search when p = n and the random walk when p = 0. (However, throughout this
paper we shall assume p > 0.)

There are three facts that must be proven in order to show that this algorithm works. 
The first is to show that a set of p random walks of length $\tau_1$, one from each of
the randomly chosen leaders, visits all the vertices of a connected graph with high
probability. Otherwise an adversary could choose s and t among those vertices unlikely
to be visited from the leaders and conceivably foil the algorithm. In other words, we need 
to derive a bound on the expected time required by p parallel and independent random 
walks to cover the graph, a problem of interest in its own right. Typically, results about 
graph coverage rely heavily on the long-run behavior of the corresponding Markov 
chain and its convergence to a limit distribution. Here we must prove something about 
short-term behavior of the Markov chain and coverage of local neighborhoods in a 
graph. 

The second fact to prove is that if s and t are in the same component then they are linked
up through the leaders in a small number of trials from each leader if enough leaders 
are chosen within the component. Coverage of the graph as described above does not 
suffice to prove this because s and t may be visited by different walks. Indeed, all the 
vertices in G could be visited by the walks even with s and t in different components. 

The third fact is to show that, with high probability, within O (log n) choices of the set 
of leaders, the component containing s and t gets enough leaders at least once. 

To aid the intuition of the reader, let us consider the case when G is a simple path
on n vertices. For p leaders chosen at random, the maximum gap between two
leaders is no more than n ln n / p with high probability; the expected time to cover
this maximum gap is $\Theta(n^2 \log^2 n / p^2)$. Hence O(log n) trials (random walks of length
$O(n^2 \log^2 n / p^2)$ from each leader) will almost surely cover all the gaps between them
for a total of $\Theta(n^2 \log^3 n / p)$ steps. Extending this technique to even 3-regular graphs
requires considerably more complicated machinery and the general bound is weaker. (In
particular, the walks need to have length $O(n^2 \log^3 n / p^2)$ and we need to try O(log n)
choices for leaders.)
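
The arithmetic behind the path example can be spelled out as follows. This is only an order-of-magnitude sketch with constants dropped; the symbol $g_{\max}$ for the largest gap is introduced here for illustration, and the calculation relies on the standard fact that a random walk covers a path segment of length g in expected time $\Theta(g^2)$.

```latex
g_{\max} \le \frac{n \ln n}{p}
\quad\Longrightarrow\quad
\text{time to cover the largest gap} = \Theta\bigl(g_{\max}^2\bigr)
  = \Theta\!\left(\frac{n^2 \log^2 n}{p^2}\right),

\underbrace{p}_{\text{leaders}} \cdot
\underbrace{\Theta(\log n)}_{\text{trials per leader}} \cdot
\underbrace{\Theta\!\left(\frac{n^2 \log^2 n}{p^2}\right)}_{\text{steps per trial}}
  = \Theta\!\left(\frac{n^2 \log^3 n}{p}\right).
```

For a path, m = n - 1, so the target bound O(mn/p) is $O(n^2/p)$ and the total above exceeds it by a factor of $\log^3 n$.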

Our main results are: 

Theorem 1 Let G be a connected, undirected graph with n vertices and m edges.
Let L be a subset of p vertices chosen at random according to the stationary distribution.
Let $S_v(t)$ denote the set of vertices seen in a random walk of length t starting at v. The
random variable $C_p$ is defined by

$$C_p = \inf\Bigl\{t : \bigcup_{l \in L} S_l(t) = V\Bigr\},$$

that is, $C_p$ is the time needed for p parallel random walks to visit all the vertices in the
graph. Then

$$E(C_p) = O\!\left(\frac{m^2 \log^3 n}{p^2}\right).$$

Theorem 2 There is an algorithm that, given an undirected graph G with n vertices,
m edges, and given two vertices s and t of G, decides USTCON with one-sided error
using space p and time $O(m^2 \log^5 n / p)$. If s and t are in the same connected component,
the algorithm outputs YES with probability $1 - O(n^{-1})$, otherwise it outputs NO.



• The upper bound on the parallel cover time given in Theorem 1 is an overestimate
by at most an O(log n) factor, at least for linear-size graphs. This is easily seen
from the path graph example.

• The algorithm mentioned in Theorem 2 runs in time that is within a $\log^5 n$ factor
of our target time-bound of O(mn/p) for linear-sized graphs. The polylog factor
arises from less than optimal bounds used in the analysis of our probabilistic 
algorithm. However, the case of the path graph considered above shows that for 
our algorithm this factor is at least $\log^3 n$.

2 Covering a Graph with p random walks 

In this section we derive an upper bound on the time taken by p parallel and independent 
walks to cover the graph (Theorem 1). 

We denote by {v, w} the undirected edge between vertices v and w and by [v, w] its
directed version. For the purposes of the proof, we need to look at the random walk
in two ways: first, as a Markov chain $X_t$, where each state is a vertex in G (the vertex
process); second, as a Markov chain $Y_t$, where each state is a directed edge (the edge
process). The transition rule for the vertex process is that if $X_t = v$, then $X_{t+1}$ is
equally likely to be any of the neighbors of vertex v. The edge process is defined by
$Y_t = [X_{t-1}, X_t]$, $t \ge 1$. The stationary distribution of the vertex process, denoted $\pi$,
is given by $\pi_v = d_v/(2m)$ where $d_v$ is the degree of the vertex v, and the stationary
distribution of the edge process, denoted $\pi'$, is given by $\pi'_{[v,w]} = 1/(2m)$.

Let $N_v(u, T)$ (respectively $N_v([u, w], T)$) be the number of visits to the vertex u
(respectively traversals of [u, w]) in a random walk of length T starting at v. Let $S_v(T)$
(respectively $E_v(T)$) be the set of vertices (edges) visited in a random walk of length
T starting at v. Finally, let $H_v(u)$ (respectively $H_v([u, w])$) be the first time the vertex
u (the edge [u, w]) is encountered by a random walk starting from v. For all of these
random variables, a replacement of the subscript v with the subscript $\pi$ (respectively
[v, w]) denotes a random walk starting at the stationary distribution (respectively the
directed edge [v, w]).



Lemma 3 Let G be a connected, undirected graph on n vertices. Consider a
random walk of length $\tau$ starting from the stationary distribution. Then for every
directed edge [v, w]

$$\Pr\bigl([v,w] \in E_\pi(\tau)\bigr) \ge \frac{E\bigl(N_\pi([v,w],\tau)\bigr)}{1 + E\bigl(N_{[v,w]}([v,w],\tau)\bigr)}.$$

Proof: Clearly

$$\begin{aligned}
E\bigl(N_\pi([v,w],\tau)\bigr) &= \sum_{1 \le t \le \tau} \Pr\bigl(H_\pi([v,w]) = t\bigr)\Bigl(1 + E\bigl(N_{[v,w]}([v,w],\tau - t)\bigr)\Bigr) \\
&\le \Pr\bigl(H_\pi([v,w]) \le \tau\bigr)\Bigl(1 + E\bigl(N_{[v,w]}([v,w],\tau)\bigr)\Bigr).
\end{aligned}$$

But $\Pr(H_\pi([v,w]) \le \tau) = \Pr([v,w] \in E_\pi(\tau))$, yielding the lemma. □

Lemma 4 Let G be a connected, undirected graph with n vertices and m edges.
Then, for every directed edge [v, w]

$$E\bigl(N_{[v,w]}([v,w],\tau)\bigr) \le \frac{\tau}{2m} + k_2\sqrt{\tau \ln n},$$

where $k_2$ is an absolute constant.

Proof: We consider the edge process $Y_t$. From standard results in renewal theory [8]
we obtain that

$$E\bigl(N_{[v,w]}([v,w],\tau)\bigr) = \pi'_{[v,w]}\Bigl(\tau + E\bigl(H_{Y_{[v,w]}(\tau)}([v,w])\bigr)\Bigr). \tag{1}$$

Clearly

$$E\bigl(H_{Y_{[v,w]}(\tau)}([v,w])\bigr) = E\bigl(H_{X_w(\tau)}(v)\bigr) + E\bigl(H_v([v,w])\bigr). \tag{2}$$

Let d(x, y) be the distance (the length of the shortest path) between two vertices x and
y in G. Let c be a sufficiently large constant.

We first bound $E(H_{X_w(\tau)}(v))$ using the fact that $d(X_w(\tau), w)$ is not likely to be more
than $c\sqrt{\tau \ln n}$. By the law of total probability

$$\begin{aligned}
E\bigl(H_{X_w(\tau)}(v)\bigr) = {} & \sum_{0 \le r \le c\sqrt{\tau \ln n}} E\bigl(H_{X_w(\tau)}(v) \mid d(X_w(\tau), v) = r\bigr)\,\Pr\bigl(d(X_w(\tau), v) = r\bigr) \\
& + E\bigl(H_{X_w(\tau)}(v) \mid d(X_w(\tau), v) > c\sqrt{\tau \ln n}\bigr) \times \Pr\bigl(d(X_w(\tau), v) > c\sqrt{\tau \ln n}\bigr).
\end{aligned} \tag{3}$$

Since $d(X_w(\tau), v) \le 1 + d(X_w(\tau), w)$, we obtain from the main result of [4] that

$$\begin{aligned}
\Pr\bigl(d(X_w(\tau), v) > c\sqrt{\tau \ln n}\bigr) &\le \Pr\bigl(d(X_w(\tau), w) > c\sqrt{\tau \ln n}\bigr) \\
&\le \sum_{x \,:\, d(w,x) > c\sqrt{\tau \ln n}} \Pr\bigl(X_w(\tau) = x\bigr) \\
&\le 2 n^{3/2} \exp\!\left(-\frac{c^2 \ln n}{2}\right) \le n^{-3},
\end{aligned} \tag{4}$$

for a sufficiently large c.

For any two vertices x and y in the same component we can apply the bound implicitly
proven in [1]

$$E\bigl(H_x(y)\bigr) \le m\, d(x,y). \tag{5}$$

Plugging equation (5) and equation (4) in equation (3) we obtain that

$$E\bigl(H_{X_w(\tau)}(v)\bigr) \le cm\sqrt{\tau \ln n} + 1. \tag{6}$$

Turning to the second term of the right side of equation (2), we observe that

$$E\bigl(H_v([v,w])\bigr) \le 2m + 1, \tag{7}$$

because the expected time to return to v given that v was left through an edge other
than [v, w] is at most $2m/(d_v - 1)$ and the expected number of returns to v before
exiting through [v, w] is $d_v - 1$. (The former fact follows from $2m/d_v = E(H_v(v)) \ge
(d_v - 1)/d_v \cdot E(H_v(v) \mid v \text{ not left via } [v, w])$.)

Combining equations (6), (7), and (2), we obtain that

$$E\bigl(H_{Y_{[v,w]}(\tau)}([v,w])\bigr) \le cm\sqrt{\tau \ln n} + 2m + 2.$$

Finally, from equation (1), because $\pi'_{[v,w]} = 1/(2m)$ for any edge [v, w],

$$E\bigl(N_{[v,w]}([v,w],\tau)\bigr) \le \frac{\tau}{2m} + \frac{1}{2}c\sqrt{\tau \ln n} + O(1).$$

From here, the Lemma follows with an appropriate value for c. □

Lemma 5 Let G be a connected, undirected graph with n vertices and m edges.
Let L be a set of p vertices (called leaders) in G chosen independently according to
the stationary distribution. For every constant $c_1 > 0$ there exists a constant $c_2$ such
that for every directed edge [v, w] a set of p walks of length $c_2 m^2 \ln^3 n / p^2$, one from
each of the leaders, satisfies

$$\Pr\Bigl([v,w] \in \bigcup_{l \in L} E_l\bigl(c_2 m^2 \ln^3 n / p^2\bigr)\Bigr) \ge 1 - \frac{1}{n^{c_1}}.$$

Proof: For p = O(log n) the conclusion is obvious. For larger p we start from

$$\Pr\Bigl([v,w] \notin \bigcup_{l \in L} E_l(\tau)\Bigr) = \prod_{l \in L} \Pr\bigl([v,w] \notin E_l(\tau)\bigr)$$

and, since each vertex l is chosen according to the stationary distribution, Lemma 3
gives us a bound on $\Pr([v,w] \notin E_l(\tau))$. By Lemma 4 and because $E(N_\pi([v,w],\tau)) =
\tau/(2m)$, there exists a constant $c_3 > 0$ such that

$$\Pr\Bigl([v,w] \notin \bigcup_{l \in L} E_l(\tau)\Bigr) \le \left(1 - \frac{c_3\sqrt{\tau}}{m\sqrt{\ln n}}\right)^{p}$$

provided that $\tau = O(m^2 \log n)$. Now taking $\tau = c_2 m^2 \ln^3 n / p^2$ yields the result. □

Theorem 6 Let G = (V, E) be a connected, undirected graph with n vertices and
m edges. Let L be a subset of p vertices chosen at random according to the stationary
distribution. Let $S_v(t)$ denote the set of vertices seen in a random walk of length t
starting at v. The random variable $C_p$ is defined by

$$C_p = \inf\Bigl\{t : \bigcup_{l \in L} S_l(t) = V\Bigr\},$$

that is, $C_p$ is the time needed for p parallel random walks to visit all the vertices in the
graph. Then

$$E(C_p) = O\!\left(\frac{m^2 \log^3 n}{p^2}\right).$$

Proof: Corollary of the previous lemma. □

In fact Lemma 5 implies the stronger result that the time needed for p parallel random
walks to traverse every edge in the graph is $O(m^2 \log^3 n / p^2)$.
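
The quantity $C_p$ can also be estimated empirically. The following sketch simulates p walks in lock step and reports the first time at which every vertex has been seen. The function name `parallel_cover_time`, the adjacency-list dict `adj`, and the safety cap `max_steps` are assumptions of this example; the graph is assumed to be connected with no isolated vertices.

```python
import random

def parallel_cover_time(adj, leaders, rng=None, max_steps=10**6):
    """Steps until walks started at `leaders` have jointly seen all vertices.

    Returns None if the cap `max_steps` is reached first.
    """
    rng = rng or random.Random()
    positions = list(leaders)
    seen = set(positions)           # vertices seen at time 0
    t = 0
    while len(seen) < len(adj) and t < max_steps:
        # Every walk takes one independent step to a random neighbor.
        positions = [rng.choice(adj[v]) for v in positions]
        seen.update(positions)
        t += 1
    return t if len(seen) == len(adj) else None
```

Averaging the returned value over many independent draws of the leader set gives an empirical estimate that can be compared with the bound of Theorem 6.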

3 An Algorithm for USTCON in O (p) Space 

We now present the algorithm for USTCON using O (p) space. As a subroutine, we 
use a standard Union/Find algorithm. 
algorithm stConn;
begin
    (* k1, k3, and k4 are suitably large constants *)
    do k4 ln n times begin
        Let L be a set of p elements of V, chosen independently
            at random according to the stationary distribution;
        L := L ∪ {s, t};
        Construct a perfect hash function for the elements of L;
        for every l in L do Set(l) := l;
        do k3 ln n times begin
            for every l in L do begin
                Take a random walk X_l(t) of length k1 m^2 ln^3 n / p^2
                    from l;
                At each step, if X_l(t) ∈ L then
                    Union(Find(X_l(t)), Find(l));
            end;
        end;
        if Find(s) = Find(t)
            then return ("YES: s and t are connected")
    end;
    return ("NO: s and t don't seem to be connected")
end.
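
A runnable rendering of the listing above might look as follows. It is a sketch under several assumptions that are not part of the original: the graph is an adjacency-list dict `adj`, a Python dict stands in for the perfect hash function, the union-find uses path halving without weighting, and the default values of `k1`, `k3`, and `k4` are arbitrary placeholders rather than the constants required by the analysis. The parameter p must be at least 1.

```python
import math
import random

def st_conn(adj, s, t, p, k1=1.0, k3=2.0, k4=2.0, rng=None):
    """One-sided test for s-t connectivity using about p leaders."""
    rng = rng or random.Random()
    n = len(adj)
    # Heads of all 2m directed edges; a uniform pick is a stationary sample.
    heads = [w for v in adj for w in adj[v]]
    m = len(heads) // 2
    if s == t:
        return True
    if m == 0:
        return False
    log_n = max(1.0, math.log(n))
    walk_len = math.ceil(k1 * m * m * log_n ** 3 / (p * p))
    for _outer in range(math.ceil(k4 * log_n)):
        leaders = {rng.choice(heads) for _ in range(p)} | {s, t}
        parent = {l: l for l in leaders}      # Set(l) := l

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        for _phase in range(math.ceil(k3 * log_n)):
            for l in leaders:
                v = l
                for _step in range(walk_len):
                    if not adj[v]:             # isolated vertex
                        break
                    v = rng.choice(adj[v])
                    if v in parent:            # the walk met a leader
                        parent[find(v)] = find(l)
        if find(s) == find(t):
            return True
    return False
```

Because two leaders are merged only when a walk from one physically reaches the other, a True answer is always correct, which matches the one-sided error of the algorithm.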

Theorem 7 The algorithm stConn runs in time $O(m^2 \log^5 n / p)$ using space O(p).

Proof: Choosing a random set of p vertices according to the stationary distribution
can be done in O(m) steps using O(p log n) random bits and O(p) space. Constructing
a perfect hash function for storing L requires expected time O(p) [6]. If the unions
are weighted and each union causes path compression on all elements of the set, then
each find has cost O(1). Since at most O(n) non-trivial unions are performed, the
cost of all the unions is O(n log n). Performing all O(log n) random walks of length
$O(m^2 \log^3 n / p^2)$ takes time $O(m^2 \log^4 n / p^2)$ per leader for a total of $O(m^2 \log^4 n / p)$
time. Since this is also the total number of finds and lookups performed, this is the
running time of each execution of the outermost loop. The outermost loop is executed
O(log n) times, which gives the claimed bound. □

Note that this algorithm is easily parallelizable using p processors and O(p) space. The
parallel hashing scheme described in [7] can be used to implement a parallel version of
this algorithm that runs on p processors, $n^{\epsilon} < p < n^{1-\epsilon}$, $\epsilon > 0$, that are connected by
a bounded degree network. Briefly, storing the leader set using parallel hashing allows
for the p processors to execute parallel unions and parallel finds in time $O(p^{\epsilon'})$ for any
$\epsilon' > 0$, and consequently the random walks from each of the leaders can be executed
in parallel. The resulting parallel implementation of the stConn algorithm runs in time
$O(m^{2+\epsilon'}/p^2)$.
4 The Correctness of stConn



Because our algorithm has one-sided error, it suffices to analyze its correctness in the 
case when s and t are in the same component of G. If G is actually connected, the
results of section 2 show that, in one pass through the inner loop of stConn, every
edge is traversed with high probability. From this, it is possible to deduce that every 
leader either discovers or is discovered by some other leader. As mentioned earlier, 
however, this is not enough to prove that s and t become linked by a chain of leaders 
after O (log n) passes through this inner loop, since it may be that certain leaders always 
discover each other. The rest of this section shows that s and t will be "linked up" with 
high probability by the algorithm. 



Theorem 8 Let G be a connected, undirected graph with n vertices and m edges.
Let L be a set of p leaders, each chosen at random according to the stationary
distribution. Then for any $c_1 > 0$ there is a constant $c_2 > 0$ such that for every
directed edge [v, w]

$$\Pr\bigl(L \cap S_{[v,w]}(c_2 m^2 \ln^3 n / p^2) \ne \emptyset\bigr) \ge 1 - \frac{1}{n^{c_1}},$$

where $S_{[v,w]}(T)$ denotes the set of distinct vertices visited in a T step random walk
starting at [v, w].



Proof: The proof is very similar to that of Lemma 5. As before the case p = O(log n)
is trivial.

Let e be a directed edge chosen uniformly at random. By a proof virtually identical to
that of Lemma 3,

$$\Pr\bigl(e \in E_{[v,w]}(\tau)\bigr) \ge \frac{E\bigl(N_{[v,w]}(e,\tau)\bigr)}{1 + E\bigl(N_e(e,\tau)\bigr)}.$$

Obviously, if e is chosen uniformly at random then

$$E\bigl(N_{[v,w]}(e,\tau)\bigr) = \frac{\tau}{2m}.$$

By Lemma 4

$$E\bigl(N_e(e,\tau)\bigr) \le \frac{\tau}{2m} + k_2\sqrt{\tau \ln n}.$$

Hence, for e chosen uniformly at random, there exists a constant $c_3$ such that

$$\Pr\bigl(e \in E_{[v,w]}(c_2 m^2 \ln^3 n / p^2)\bigr) \ge c_3\,\frac{\ln n}{p},$$

provided that $p = \Omega(\log n)$.

In order to choose a leader according to the stationary distribution, one can choose a
directed edge e uniformly at random and let the leader be the head of e. Since the
probability of reaching a leader is greater than the probability of traversing the edge
chosen to determine it, we obtain that

$$\Pr\bigl(L \cap S_{[v,w]}(c_2 m^2 \ln^3 n / p^2) = \emptyset\bigr) \le \Bigl(1 - \Pr\bigl(e \in E_{[v,w]}(c_2 m^2 \ln^3 n / p^2)\bigr)\Bigr)^{p} \le \frac{1}{n^{c_1}}$$

for a sufficiently large $c_2$. □

Corollary 9 Let G be a connected, undirected graph with n vertices and m edges.
Let L be a set of p leaders chosen at random according to the stationary distribution.
Then for any $c_1 > 0$ there is a constant $c_2 > 0$ such that

$$\Pr\bigl(L \cap S_s(c_2 m^2 \ln^3 n / p^2) \ne \emptyset\bigr) \ge 1 - \frac{1}{n^{c_1}}$$

and

$$\Pr\bigl(L \cap S_t(c_2 m^2 \ln^3 n / p^2) \ne \emptyset\bigr) \ge 1 - \frac{1}{n^{c_1}}.$$

□

Let L be any set of p leaders. We say the set L is good if for an absolute constant $k_1$
the following two properties hold:

1. The probability that a set of p independent random walks of length $k_1 m^2 \ln^3 n / p^2$,
one from each leader in L, traverses every edge in G is at least 1 — 1 /n^ . 

2. For every edge [v, w] in G, the probability that a random walk of length
$k_1 m^2 \ln^3 n / p^2$ starting from [v, w] visits some leader in L is at least $1 - 1/n^3$.

Lemma 10 Let G be a connected, undirected graph with n vertices and m edges.
Let L be a set of p leaders chosen independently at random according to the stationary
distribution. Then $\Pr(L \text{ is good}) \ge 1 - 1/n$.

Proof: Say that a set of random walks, one from each of the leaders, is unsuccessful
for [v, w] if [v, w] is not visited by any of them. Letting $c_1 = 6$ in Lemma 5, we
see that at most $1/n^3$ of the possible leader sets can have probability greater than $1/n^3$
of yielding unsuccessful random walks for any fixed [v, w]. Similarly, letting $c_1 = 6$
in Theorem 8, we see that at most $1/n^3$ of the possible leader sets have probability
greater than $1/n^3$ of remaining undiscovered in a random walk of length $\tau$ from any
fixed edge [v, w]. The probability that a leader set is not good is bounded by the sum
of the probabilities that it isn't good because it violates properties 1 or 2. Since there
are fewer than $n^2$ edges, the probability that a leader set is bad is bounded by 1/n. □

Lemma 11 Let G be a connected, undirected graph with n vertices and m edges.
Let L be a set of p leaders chosen independently at random according to the stationary
distribution. Suppose that L is a good set of leaders. Let A and B be a partition of
L into two nonempty subsets. Consider a random walk of length $2\tau$ from each of the
leaders in L (here $\tau$ is the walk length $k_1 m^2 \ln^3 n / p^2$ from the definition of a good
set). Then the probability that some leader in A is visited from some leader in
B or vice versa is greater than 1/18.

Proof: (Unless stated otherwise, all edges referred to in this proof are directed.) We
assign to each edge in the graph two labels: a "To" label T and a "From" label F. These
labels are subsets of the set {A, B}. By definition, $A \in T(e)$ (respectively $B \in T(e)$)
if the probability that e is visited by one of the walks of length $\tau$ emanating from the
leaders in A (respectively the leaders in B) is greater than 1/3. Analogously, $A \in F(e)$
(respectively $B \in F(e)$) if the probability that some leader in A (respectively B) is
visited in a random walk of length $\tau$ starting from e is at least 1/3.

Properties 1 and 2 of good leader sets imply that for each edge neither label is empty.
We now consider four cases:

1. There is some edge [v, w] with $A \in F([v, w])$ and $B \in T([v, w])$ or vice versa.

Then with probability 1/3 edge [v, w] is visited by one of the random walks of
length $\tau$ originating in A and with probability 1/3 a leader in B is visited in the
remaining at least $\tau$ steps. Hence, with probability at least 1/9 a leader in B is
visited from a leader in A.

After eliminating this case the only remaining possibility is that for every edge
F([v, w]) = T([v, w]) = {A} or F([v, w]) = T([v, w]) = {B}.

2. There is some undirected edge {v, w} such that F([v, w]) = T([v, w]) = {A},
and F([w, v]) = T([w, v]) = {B}.

Then with probability > 1/3, [v, w] is visited by one of the walks of length $\tau$
originating in A and hence the vertex v is visited by one of these walks with
probability at least 1/3. Since a leader in B is visited from [w, v] in $\tau$ steps with
probability > 1/3, a leader in B is visited from v in $\tau$ steps with probability > 1/3.
Hence with probability at least 1/9 a leader in B is visited from a leader in A.

3. No label in the graph contains A or no label in the graph contains B.

Without loss of generality, consider the first of the two conditions. Then every
edge directed towards leaders in A has a "To" label of B. Therefore, with
probability 1/3, each such edge is visited by one of the random walks of length $\tau$
originating at B and a leader in A is immediately visited. Hence, with probability
at least 1/3, a leader in A is visited from a leader in B.

4. For every undirected edge {v, w}, we have either T([v, w]) = F([v, w]) =
T([w, v]) = F([w, v]) = {A} or we have T([v, w]) = F([v, w]) =
T([w, v]) = F([w, v]) = {B}.

Since case 3 does not hold and the graph is connected, there must be a vertex v
that is simultaneously the endpoint of some all-A labeled edge and some all-B
labeled edge. Assume without loss of generality that at least 1/2 of the undirected
edges with one endpoint at v have all their labels equal to B. Then since some
edge [w, v] has an A T-label, with probability at least 1/3 v is visited in the first $\tau$
steps of the random walks originating at A. Since the majority of edges leaving v
have a B F-label, with probability at least 1/2 one of these edges will be traversed
and then with probability at least 1/3, a leader in B will be reached during the
remaining at least $\tau$ steps. Hence with probability at least 1/18 a leader in B is
visited from a leader in A.

□

We say that a subset of leaders forms a component, if during some prior phase of 
the algorithm, they have all been connected up with one another. During a particular 
phase, we say that a component C is successful if it discovers some other component or 
some other component discovers it. The previous lemma proves that, if the leader set
is good, every component has probability at least 1/18 of being successful. The next 
lemma shows that the number of separate components decreases exponentially with the 
number of phases. 

Lemma 12 Let G be a connected, undirected graph with n vertices and m edges.
Let L be a set of p leaders chosen independently at random according to the stationary
distribution. Suppose that L is a good leader set. Let $N_i$ be the number of components
after the ith phase. Then there exist constants $\alpha$ and $\beta$, with $0 < \alpha, \beta < 1$, such that if
$N_i > 1$ then

$$\Pr(N_{i+1} > \beta N_i) < \alpha.$$

Proof: Plainly, $N_{i+1}$ equals $N_i$ minus the number of nonredundant links formed in
phase i. Since the number of such links formed in phase i is at least one half the number
of successful components, and the previous lemma shows that the probability that a
component is successful is at least 1/18,

$$E(\text{number of links formed in phase } i) \ge \frac{1}{2 \cdot 18}\, N_i.$$

Hence,

$$E(N_{i+1}) \le \left(1 - \frac{1}{36}\right) N_i,$$

and so there is a positive constant $\beta < 1$ such that

$$\Pr(N_{i+1} > \beta N_i) < \alpha.$$

□ 
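
The final step of this proof is an application of Markov's inequality, conditioned on the value of $N_i$. As a worked sketch with one possible choice of constants (the values $\beta = 71/72$ and $\alpha = 71/72$ are illustrative assumptions, not taken from the text):

```latex
\Pr\bigl(N_{i+1} > \beta N_i\bigr)
  \;\le\; \frac{E(N_{i+1})}{\beta N_i}
  \;\le\; \frac{35/36}{\beta},
\qquad\text{and for } \beta = \tfrac{71}{72}:\quad
\frac{35/36}{71/72} = \frac{70}{71} < \frac{71}{72} = \alpha .
```

Any $\beta$ strictly between 35/36 and 1 works in the same way, with a correspondingly chosen $\alpha < 1$.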

Lemma 13 Let G be a connected, undirected graph with n vertices and m edges.
Let L be a set of p leaders chosen independently at random according to the stationary
distribution. Suppose that L is a good leader set. Let $N_i$ be the number of components
after the ith phase. Then for any constant $c_1 > 0$, there is a constant $c_2 > 0$ such that

$$\Pr\bigl(N_{c_2 \ln n} > 1\bigr) \le \frac{1}{n^{c_1}}.$$

Proof: We say that a phase is successful if $N_{i+1} \le \beta N_i$. Since the leader set is fixed
and good, successive phases are independent (the random walks are independent), and
by the previous lemma, phase i has probability greater than $1 - \alpha$ of being successful for
each i. But the probability that $N_{c_2 \ln n}$ is greater than one is bounded by the probability
that there are fewer than $\log_{1/\beta} n$ successful phases out of $c_2 \ln n$ phases. This in turn is
bounded by the probability that there are fewer than $\log_{1/\beta} n$ successes in $c_2 \ln n$ Bernoulli
trials with probability greater than $1 - \alpha$ of success, which by Chernoff's bound is less
than $1/n^{c_1}$, for appropriately chosen $c_2$. □

Theorem 14 The algorithm stConn decides USTCON using space O(p) and time
$O((m^2 \log^5 n)/p)$ with one-sided error. If s and t are in the same connected component,
the algorithm fails to output YES with probability $O(n^{-1})$; if s and t are in different
components, it outputs NO.

Proof: If the graph consists of a single connected component, then we need only
consider one execution of the outer loop of the algorithm, wherein the algorithm can
fail to output YES when it should if either the leader set is not good, or the leader set
is good, but the number of components did not reduce to 1. By Lemma 10, the former
has probability at most 1/n and by Lemma 13 the latter, when choosing the constant $k_3$
appropriately, has probability at most 1/n and so the theorem follows in this case.

The other case is when s and t are in a single component C containing $\hat{n}$ vertices and $\hat{m}$
edges. If $m^2/p^2 > \hat{m}\hat{n}$, then in ln n random walks of length $k_1 m^2 \ln^3 n / p^2$ starting
from s, the vertex t will be seen with overwhelming probability, since the expected
cover time of the component is bounded by $\hat{m}\hat{n}$.

Otherwise, if $m^2/p^2 < \hat{m}\hat{n}$, the algorithm can fail to output YES when it should if
either none of the $k_4 \ln n$ selections of leaders include enough leaders that are in the
component C or if some selection of leaders includes enough leaders in C, but the
associated random walks do not succeed in connecting s to t. For the latter case, we
observe that, in each of the $k_4 \ln n$ executions of the outer loop of the algorithm, the
expected number of leaders that are chosen from C is $\hat{p} = p\hat{m}/m$. If $\Omega(\hat{p})$ leaders are
indeed chosen from C, then since

$$\frac{c_2 m^2 \ln^3 n}{p^2} = \frac{c_2 \hat{m}^2 \ln^3 n}{\hat{p}^2},$$

the analysis given for a single connected graph on $\hat{n}$ vertices and $\hat{m}$ edges with $\hat{p}$ leaders
yields a failure probability of $O(n^{-1})$. To bound the probability that none of the leader
selections are good, we note that the probability that fewer than $\hat{p}/2$ leaders are chosen
from C is bounded by $\exp(-(p\hat{m})/(8m)) < c$, for some constant c < 1. Therefore, the
probability that fewer than $\hat{p}/2$ leaders are chosen from C in every one of the $k_4 \ln n$
executions of the outermost loop is bounded by $O(n^{-1})$, for a sufficiently large constant
$k_4$. □

5 Open problems 

Can the bound on the parallel cover time given in Theorem 1 be improved? Note that 
we bound the cover time for all vertices by bounding the cover time for all edges. It is 
not clear that this is necessary. 

Theorem 2 shows that for p slightly larger than the average degree m/n, our algorithm
runs faster than the random walk. Devising an algorithm that runs in time

O {mn n/ p) is perhaps the most interesting open problem. 

There is no fundamental reason why our upper bound is the best possible. We thus 
hope that this work will spark interest in proving a time-space tradeoff for USTCON, 
even in a restricted model of space-bounded computation such as the JAGs of Cook 
and Rackoff [5]. For a restricted version of the JAG model, Beame et al. [2] have 
shown that space p implies time $\Omega(n^2/(p \log n))$ for bounded-degree graphs.